variants. The low-impact variants are the synonymous ones, which do not change the
protein sequence, although they may still have some effect. Variants in introns (intron variants)
and variants in the downstream region of a gene are annotated as MODIFIER since they
affect non-coding regions, where the impact is difficult to predict or there is no substantial
evidence of any effect.
To use snpEff, we must install the SnpEff software and download the database of interest.
SnpEff requires Java V1.8 or later to be installed on your computer. The
installation instructions are available at “https://pcingola.github.io/SnpEff/download/” or
“http://pcingola.github.io/SnpEff/”. First, download the compressed folder with the “wget”
Linux command, and then decompress it with the “unzip” command. You can also install the
binary.
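Because the installation depends on the Java version, it may be worth checking it first. The following is a minimal sketch rather than part of SnpEff: the function name java_major_from_banner is hypothetical, and it assumes the usual banner format printed by “java -version” (for example, openjdk version "1.8.0_292").

```shell
# Sketch: extract the major Java version from the first line of `java -version`.
# java_major_from_banner is an illustrative name, not a SnpEff or Java tool.
java_major_from_banner() {
    # $1 is a banner line such as: openjdk version "1.8.0_292"
    version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$version" in
        "")  return 1 ;;                                 # no version string found
        1.*) printf '%s\n' "$version" | cut -d. -f2 ;;   # legacy scheme: 1.8.x -> 8
        *)   printf '%s\n' "$version" | cut -d. -f1 ;;   # modern scheme: 11.0.x -> 11
    esac
}

# Example use (requires Java on PATH, so it is left commented out here):
#   banner=$(java -version 2>&1 | head -n 1)
#   major=$(java_major_from_banner "$banner")
#   [ "$major" -ge 8 ] && echo "Java $major meets the stated requirement"
```

Once a suitable Java version is confirmed, the download and decompression commands are: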
wget https://snpeff.blob.core.windows.net/versions/snpEff_latest_core.zip
unzip snpEff_latest_core.zip
Unzipping the archive creates the directory “snpEff”, which contains two Java executable files with the “.jar”
extension (snpEff.jar and SnpSift.jar), a snpEff configuration file (snpEff.config), a license
file, and four subdirectories (“examples”, “exec”, “galaxy”, and “scripts”). By default, the
database is stored in a subdirectory “data” inside the “snpEff” directory, but we can change
this by opening “snpEff.config” and editing “data.dir = ./data/” to point to the path that we need.
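The edit to “snpEff.config” can also be scripted instead of done in a text editor. The sketch below is one possible way to do it, under stated assumptions: the helper name set_snpeff_data_dir and both of its arguments are hypothetical, and the new path is assumed not to contain the characters “|” or “&”, which have a special meaning in the sed expression used.

```shell
# Sketch: point the data.dir setting in snpEff.config at another directory.
# set_snpeff_data_dir is a hypothetical helper, not shipped with SnpEff.
set_snpeff_data_dir() {
    config_file="$1"   # path to snpEff.config
    new_dir="$2"       # directory that should hold the databases
    tmp_file="${config_file}.tmp"
    # Rewrite only the line that starts with "data.dir"; all other lines pass through.
    sed "s|^data\.dir[[:space:]]*=.*|data.dir = ${new_dir}|" "$config_file" > "$tmp_file" \
        && mv "$tmp_file" "$config_file"
}

# Example use (paths are placeholders):
#   set_snpeff_data_dir ./snpEff/snpEff.config /path/to/databases/
```

Writing to a temporary file and then moving it into place avoids relying on the in-place option of sed, which differs between GNU and BSD versions.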
The database can be downloaded automatically, which is recommended, but you can
also install it manually. For example, to download the human database manually,
you can run the following:
java -jar snpEff.jar download GRCh38.
Figure 4.11 shows the snpEff directory structure. If you installed the binary files, you
can use any of the executables without the “java” command, as follows:
snpEff download GRCh38.
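Since the same subcommands can be launched either through an installed executable or through “java -jar”, a small shell helper can pick whichever is available. This is a minimal sketch, not part of SnpEff: the function name snpeff_launcher and the default jar location ./snpEff/snpEff.jar are assumptions made for illustration.

```shell
# Sketch: print the command prefix to use for launching SnpEff.
# snpeff_launcher is a hypothetical helper; the default jar path is an assumption.
snpeff_launcher() {
    jar_path="${1:-./snpEff/snpEff.jar}"
    if command -v snpEff >/dev/null 2>&1; then
        # An executable named snpEff is on PATH (binary installation).
        echo "snpEff"
    elif [ -f "$jar_path" ]; then
        # Fall back to running the jar through Java.
        echo "java -jar $jar_path"
    else
        echo "snpEff not found on PATH and no jar at $jar_path" >&2
        return 1
    fi
}
```

The printed prefix can then be placed in front of the subcommand and database name. If it is used unquoted, as in $(snpeff_launcher), the jar path must not contain spaces.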
When you use snpEff, you must provide the correct path to the file. For instance, if your working
directory is the parent of the “snpEff” directory, then you can run:
FIGURE 4.11 snpEff directory structure.